An Efficient Association Rule Mining Using the H-BIT Array Hashing Algorithm
Abstract
Association Rule Mining (ARM) finds interesting relationships between the presence of various items in a given database. Apriori is the traditional algorithm for learning association rules. However, it suffers from repeated database scans and from generating a large number of candidate itemsets, and each level of candidate itemsets requires separate memory locations. The Hash Based Frequent Itemsets Quadratic Probing (HBFI QP) algorithm is based on a hashing technique for mining frequent itemsets. To avoid collisions and primary clustering in the hashing process, it uses the Quadratic Probing (QP) technique. Although primary clustering and collisions are eliminated, secondary clustering forms in all cases, and the hash table occupies more space than the total number of items in the database. To avoid these problems, this paper presents the H-Bit Array Hashing (H-BAH) algorithm. The H-BAH algorithm reduces the hash table size required for placing items, and it also avoids hash collisions and secondary clustering. The H-Bit array, which is added to the first (header) bucket of the table, records which buckets were hashed initially. When a collision occurs during hashing, the H-BAH algorithm searches the neighbourhood of buckets near the originally hashed bucket so that the collided item can be placed quickly. The H-BAH algorithm produces frequent itemsets with less computation time and memory than the existing algorithm.
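The abstract does not give the details of H-BAH, so the following is only a minimal sketch of the general idea it describes: each home bucket keeps an H-bit map of which nearby slots hold its items, and a collided item is placed in the neighbourhood of its home bucket. The class name `NeighbourhoodCounter`, the helper `frequent_items`, the neighbourhood width `H = 4`, and the per-bucket (rather than header-bucket) placement of the bitmaps are all assumptions made for illustration, not the paper's method. The sketch counts single items only.

```python
from typing import Dict, Hashable, Iterable, List, Optional

H = 4  # neighbourhood width; assumed, the abstract gives no value


class NeighbourhoodCounter:
    """Counts item occurrences in a table with H-bit neighbourhood bitmaps."""

    def __init__(self, size: int) -> None:
        if size < H:
            raise ValueError("table size must be at least H")
        self.size = size
        self.keys: List[Optional[Hashable]] = [None] * size
        self.counts: List[int] = [0] * size
        # bitmaps[b] has bit i set when slot (b + i) % size holds an item
        # whose home bucket is b.
        self.bitmaps: List[int] = [0] * size

    def _find(self, item: Hashable) -> Optional[int]:
        """Return the slot holding item, checking only flagged neighbours."""
        home = hash(item) % self.size
        for i in range(H):
            if (self.bitmaps[home] >> i) & 1:
                slot = (home + i) % self.size
                if self.keys[slot] == item:
                    return slot
        return None

    def add(self, item: Hashable) -> None:
        slot = self._find(item)
        if slot is not None:
            self.counts[slot] += 1
            return
        home = hash(item) % self.size
        # Not present yet: take the first free slot within the neighbourhood.
        for i in range(H):
            slot = (home + i) % self.size
            if self.keys[slot] is None:
                self.keys[slot] = item
                self.counts[slot] = 1
                self.bitmaps[home] |= 1 << i
                return
        # This sketch does not resize or displace entries.
        raise OverflowError("neighbourhood of the home bucket is full")

    def count(self, item: Hashable) -> int:
        slot = self._find(item)
        return 0 if slot is None else self.counts[slot]


def frequent_items(
    transactions: Iterable[Iterable[Hashable]],
    min_support: int,
    table_size: int,
) -> Dict[Hashable, int]:
    """Return items that appear in at least min_support transactions."""
    table = NeighbourhoodCounter(table_size)
    for transaction in transactions:
        for item in set(transaction):  # count each item once per transaction
            table.add(item)
    return {
        key: cnt
        for key, cnt in zip(table.keys, table.counts)
        if key is not None and cnt >= min_support
    }
```

Because a lookup inspects at most H flagged slots next to the home bucket, the probe length is bounded regardless of how full the rest of the table is, which is the property the abstract attributes to searching the neighbourhood of the originally hashed bucket.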
Similar resources
A new approach based on data envelopment analysis with double frontiers for ranking the discovered rules from data mining
Data envelopment analysis (DEA) is a relatively new data-oriented approach to evaluating the performance of a set of peer entities called decision-making units (DMUs) that convert multiple inputs into multiple outputs. Within a relatively limited period, DEA has become a strong quantitative and analytical tool for measuring and evaluating performance. In an article written by Toloo et al. (2009...
Fast Vertical Mining Using Boolean Algebra
The vertical association rule mining algorithm is an efficient mining method that uses the support sets of frequent itemsets to calculate the support of candidate itemsets. It overcomes the disadvantage of scanning the database many times, as the Apriori algorithm does. In vertical mining, frequent itemsets can be represented as a set of bit vectors in memory, which enables fast computation. The...
Ramp: High Performance Frequent Itemset Mining with Efficient Bit-Vector Projection Technique
Mining frequent itemset using bit-vector representation approach is very efficient for small dense datasets, but highly inefficient for sparse datasets due to lack of any efficient bit-vector projection technique. In this paper we present a novel efficient bit-vector projection technique, for sparse and dense datasets. We also present a new frequent itemset mining algorithm Ramp (Real Algorithm...
Compressed Image Hashing using Minimum Magnitude CSLBP
Image hashing allows compression, enhancement or other signal processing operations on digital images which are usually acceptable manipulations. Whereas, cryptographic hash functions are very sensitive to even single bit changes in image. Image hashing is a sum of important quality features in quantized form. In this paper, we proposed a novel image hashing algorithm for authentication which i...
An Incremental Mining Algorithm for Association Rules Based on Minimal Perfect Hashing and Pruning
In the literature, hash-based association rule mining algorithms are more efficient than Apriori-based algorithms, since they employ hash functions to generate candidate itemsets efficiently. However, when the dataset is updated, the whole hash table needs to be reconstructed. In this paper, we propose an incremental mining algorithm based on minimal perfect hashing. In our algorithm, each can...
Publication date: 2013